Abstract.

RNA molecules follow a succession of enzyme-mediated processing steps from
transcription until maturation. The participating enzymes, for example the
spliceosome for mRNAs and Drosha and Dicer for microRNAs, are also produced
in the cell and their copy-numbers fluctuate over time. Enzyme copy-number changes
affect the processing rate of the substrate molecules; high enzyme numbers increase
the processing probability, low enzyme numbers decrease it. We study different RNA
processing cascades where enzyme copy-numbers are either fixed or fluctuate. We
find that for fixed enzyme copy-numbers the substrates at steady-state are
Poisson-distributed, and the whole RNA cascade dynamics can be understood as a single
birth-death process of the mature RNA product. In this case, solely fluctuations in the
timing of RNA processing lead to variation in the number of RNA molecules. However,
we show analytically and numerically that when enzyme copy-numbers fluctuate, the
strength of RNA fluctuations increases linearly with the RNA transcription rate. This
linear effect becomes stronger as the speed of enzyme dynamics decreases relative to the
speed of RNA dynamics. Interestingly, we find that under certain conditions, the RNA
cascade can reduce the strength of fluctuations in the expression level of the mature
RNA product. Finally, by investigating the effects of processing polymorphisms we
show that it is possible for the effects of transcriptional polymorphisms to be enhanced,
reduced, or even reversed. Our results provide a framework to understand the dynamics
of RNA processing.



1. Introduction 

The copy-number of enzymes that mediate particular reactions is a source of intrinsic 
fluctuations in gene expression [1]. An enzyme chemically converts its substrate at a 
rate which may be either fixed, or may change over time. However, even for a fixed 
rate of chemical conversions per enzyme, prolonged changes in the enzyme copy-number 
become noticeable at the level of the enzyme's substrate. Consequently, the substrates 
convert at a rate that fluctuates over time. In this work, we consider the effect of such 
fluctuations on the production and processing of mRNA, microRNA (miRNA) or small 
interfering RNA (siRNA). 

The principal enzyme involved in mRNA-processing is the spliceosome, which 
removes intronic elements from precursor mRNA molecules [2, 3]. In the biogenesis of 
eukaryotic small RNAs (sRNA), encompassing miRNAs and siRNAs, one additionally 
finds (i) the microprocessor unit responsible for processing primary miRNAs, (ii) the 
RNA-dependent RNA polymerase responsible for synthesizing complementary strands 
to single-stranded RNA, and (iii) the nuclease Dicer responsible for processing precursor 
sRNA molecules [4]. In the biogenesis of prokaryotic sRNA one finds the Cascade/Cas
enzymes performing similar processing steps [5]. The processed eukaryotic or prokaryotic 
sRNA molecules are then loaded to Argonaute proteins forming the so called RNA- 
induced silencing complexes (RISCs), the units responsible for the post-transcriptional 
regulation (PTR) of mRNA transcripts [6, 7]. 

The enzymes involved in the mRNA and sRNA biogeneses are vital for the normal 
functioning of the cell and are often regulated by complex feedback networks. For 
example, miR162 targets the Dicer DCL1 mRNA in Arabidopsis thaliana, but the
precursor of miR162 needs Dicer to mature [8]. Hence large expression levels of
Dicer lead to large levels of the miRNA that in turn downregulates Dicer. However,
the presence of tight regulation of the enzymatic expression levels does not completely
eliminate fluctuations in their copy-numbers. At best, regulation provides a mechanism
to limit the range of fluctuations around the mean value of enzymatic copy-numbers at
steady-state.

Studies of the protein biogenesis usually assume mRNA is produced via a birth- 
death process [9, 10]. Studies of post-transcriptional regulation involving sRNAs 
assume the same [11, 12, 13, 14, 15]. This assumption that the multi-step process 
of RNA biogenesis can be modeled by a single birth-death process of the mature RNA 
product requires examination. We address this question by studying the mRNA and 
sRNA biogeneses using models that explicitly include the sequential processing of RNA 
precursors by the different enzymes. We find that under certain conditions one can 
indeed replace the complex RNA-processing cascade by a single production process of 
constant rate. In that case, mature RNA production events are statistically independent 
and follow a homogeneous Poisson process. However, outside the validity of these
conditions, this simple picture breaks down since fluctuations in an RNA-processing
cascade cannot be captured by a single homogeneous Poisson process. In this case,
enzymatic copy-number fluctuations introduce statistical dependencies in the
RNA-processing events.

2. Results 

We start with the simplest scenario possible, a chain of RNA-processing steps. Such 
RNA-processing cascades arise in different contexts. 

2.1. mRNA biogenesis 

The biogenesis of messenger RNA starts with the transcription from DNA of the primary
RNA transcript (pre-mRNA$_n$), and continues as the spliceosome excises introns until
the final mRNA product is reached. If we assume each of these steps takes place at a
constant rate, the cascade of events can be depicted schematically as follows

\[
\begin{aligned}
\emptyset \xrightarrow{k_n} \text{pre-mRNA}_n &\xrightarrow{d_n} \emptyset, \\
\text{pre-mRNA}_n \xrightarrow{k_{n-1}} \text{pre-mRNA}_{n-1} &\xrightarrow{d_{n-1}} \emptyset, \\
&\;\;\vdots \\
\text{pre-mRNA}_1 \xrightarrow{k_0} \text{mRNA} &\xrightarrow{d_0} \emptyset,
\end{aligned}
\tag{1}
\]

where $k_n$ is the gene transcription rate, $k_{n-1}, \ldots, k_0$ the precursor mRNA processing
rates, and $d_n, \ldots, d_0$ the basal degradation rates of all mRNA products. The length of
the cascade is determined by the number of introns $n$ of the primary transcript. We
will show that at steady-state and under certain conditions, each of the components of
the mRNA cascade is Poisson-distributed, and the mRNA creation dynamics follows a
homogeneous Poisson process. In that case, (1) can be replaced by
$\emptyset \xrightarrow{k_0} \text{mRNA} \xrightarrow{d_0} \emptyset$
if $k_0$ is chosen to match the effective production rate of mRNA at steady-state in (1).

2.2. siRNA biogenesis 

A second example of a chain of RNA processing is the biogenesis of endogenous 
small interfering RNAs (siRNA). This process starts with the transcription of genes or 
transposable elements to single-stranded RNA (ssRNA). The ssRNA is then converted 
by RNA-dependent RNA polymerase (RDR) to double-stranded RNA (dsRNA), which 
is in turn cleaved by Dicer nuclease to mature siRNAs. The biogenesis of exogenous 
siRNAs, involved for example in infections or transfections, is delocalized with only the 
last processing step occurring at the place where the siRNAs operate. In any case, if 
we assume that each step takes place at constant rate, then the siRNA cascade can be 
depicted as follows 



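\[
\begin{aligned}
\emptyset \xrightarrow{k_2} \text{ssRNA} &\xrightarrow{d_2} \emptyset, \\
\text{ssRNA} \xrightarrow{k_1} \text{dsRNA} &\xrightarrow{d_1} \emptyset, \\
\text{dsRNA} \xrightarrow{k_0} \text{siRNA} &\xrightarrow{d_0} \emptyset.
\end{aligned}
\tag{2}
\]

This reaction scheme is analogous to the mRNA-processing chain (1) and can be
simplified under certain conditions to a single birth-death process as discussed below.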

Not included in this scheme is the biogenesis of trans-acting small interfering 
RNA (tasiRNA). For this particular class of siRNAs, the assumption of constant rate 
of production $k_2$ of ssRNA in (2) is invalid: the tasiRNA biogenesis initiates when
fragments of post-transcriptionally regulated mRNAs, cleaved with the help of miRNAs,
are converted to dsRNA by RDR [16, 17]. This process can fluctuate considerably,
resulting in temporal changes in the production rate $k_2$ of ssRNA [15].

2.3. miRNA biogenesis 

MiRNAs originate from either intergenic DNA regions having their own promoters, 
or intragenic DNA regions of introns of protein-encoding genes [4]. The latter can 
vary in length: short intragenic fragments called mirtrons are processed directly by the 
spliceosome [18], whereas longer intragenic fragments are additionally processed by the 
microprocessor unit in animals [19], and by Dicer in plants [19, 16]. There is evidence 
that the processing order might matter, and also that the microprocessor unit and the 
spliceosome collaborate in their tasks [20]. The latter suggests that these two processing 
steps might not be entirely independent. When the microprocessor unit (or Dicer in 
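plants) acts first, the intragenic miRNA biogenesis can be depicted as follows

\[
\begin{aligned}
\emptyset \xrightarrow{k_2} \text{pri-miRNApre-mRNA} &\xrightarrow{d_2} \emptyset, \\
\text{pri-miRNApre-mRNA} \xrightarrow{k_1} \text{pre-miRNA} &\xrightarrow{d_1} \emptyset, \\
\text{pre-miRNA} \xrightarrow{k_0} \text{miRNA} &\xrightarrow{d_0} \emptyset,
\end{aligned}
\tag{3}
\]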

where pri-miRNApre-mRNA denotes the long RNA hairpin consisting of the primary 
miRNA intragenic element and the precursor mRNA fragment. If the spliceosome
acts first on the long RNA hairpin, it produces pri-miRNA from the intronic fragment,
which is subsequently processed into pre-miRNA by the microprocessor unit (or Dicer
in plants). Mirtrons have a biogenesis identical to (3), but have shorter pri-miRNAs 
and the processing of the long primary RNA hairpin is performed by the spliceosome 
only [18]. 

Intergenic miRNA follows a similar biogenesis as intragenic miRNA: 

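\[
\begin{aligned}
\emptyset \xrightarrow{k_2} \text{pri-miRNA} &\xrightarrow{d_2} \emptyset, \\
\text{pri-miRNA} \xrightarrow{k_1} \text{pre-miRNA} &\xrightarrow{d_1} \emptyset, \\
\text{pre-miRNA} \xrightarrow{k_0} \text{miRNA} &\xrightarrow{d_0} \emptyset.
\end{aligned}
\tag{4}
\]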

The only difference between intergenic and intragenic miRNA biogenesis is the 
additional production of mRNA for the latter after the spliceosome excises all introns 
from the pri-miRNApre-mRNA hairpin. Both (3) and (4) have a structure similar to 
the mRNA cascade (1). When reaction rates are constant we show below that the
miRNA cascade can be replaced by
$\emptyset \xrightarrow{k_0} \text{miRNA} \xrightarrow{d_0} \emptyset$. If the host transcript in the
intragenic miRNA biogenesis is also a member of the network investigated, then it is 
straightforward to include its biogenesis separately. 



A complication outside the applicability of (3) arises if the mature intragenic 
miRNA regulates the host transcript post-transcriptionally. The majority of intragenic 
miRNAs investigated across species (80%) are predicted not to target their hosts [21]. 
On the other hand, the biogenesis of the remaining 20% couples the production rates of 
miRNA and target, and may force PTR to operate close to the so called "derepression 
threshold" [12, 22], where targets are expressed at levels that are just sufficient to 
overcome repression via PTR. In this regime of target expression and beyond, the
RISC-formation and RISC-recycling processes play a prominent role: they control the strength
of PTR-induced fluctuations in the target transcript levels [15]. Post-transcriptional
regulation and feedback loops [23, 14] are not within the scope of our work. We focus 
solely on RNA-processing in order to understand its underlying dynamics. 

2.4. Unifying the RNA cascades under constant reaction rates

The mRNA (1) and sRNA (2-4) biogeneses under constant reaction rates are all special 
cases of the following generic cascade 

\[
\begin{aligned}
\emptyset \xrightarrow{k_x} x &\xrightarrow{d_x} \emptyset, \\
x \xrightarrow{k_y} y &\xrightarrow{d_y} \emptyset, \\
y \xrightarrow{k_z} z &\xrightarrow{d_z} \emptyset.
\end{aligned}
\tag{5}
\]

Here, each processing step comprises two sub-steps: the destruction of the ancestor
precursor and the creation of the new product. We choose this representation because
it separates processes according to intermediate components. However, the two
sub-steps of a processing reaction in (5) are not independent but take place
simultaneously. For example, the removal of an $x$ molecule by processing at rate $k_y$
never takes place without the simultaneous creation of a $y$ molecule.

Below we show analytically that at steady-state the x, y, and z products of (5) are 
Poisson-distributed. The remaining details in (1-4) not included in (5), for example
the number of introns excised in (1), influence the average steady-state expression level
of the mature product, i.e., the mean value of the Poisson distribution. All else being
equal, the expression levels of two mature mRNAs for example, might be different if 
one has introns in its precursor and the other does not, but they are both going to be 
Poisson-distributed at steady-state if processing rates per precursor mRNA are constant. 

2.5. Solution of the RNA biogenesis under constant reaction rates

We consider the simplest scenario in some detail here as the same tools will be used 
to treat the effects of fluctuating reaction rates in section 2.7. The master equation 
describing the generic RNA biogenesis (5) is 

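\[
\begin{aligned}
\frac{\partial P_{x,y,z}}{\partial t} = \Big[ & k_x \left(\mathcal{E}_x^- - 1\right) + d_x \left(\mathcal{E}_x^+ - 1\right) x + k_y \left(\mathcal{E}_x^+ \mathcal{E}_y^- - 1\right) x \\
& + d_y \left(\mathcal{E}_y^+ - 1\right) y + k_z \left(\mathcal{E}_y^+ \mathcal{E}_z^- - 1\right) y + d_z \left(\mathcal{E}_z^+ - 1\right) z \Big] P_{x,y,z},
\end{aligned}
\tag{6}
\]

where $P_{x,y,z}(t)$ is the probability for the molecular numbers in the system at a given
time to be $x, y, z$. The shift operators $\mathcal{E}_n^{\pm}$ are defined by
$\mathcal{E}_n^{\pm} g(n) = g(n \pm 1)$, where $n \in \{x, y, z\}$. Using the generating function
$f(r, s, q, t) = \sum_{x,y,z} r^x s^y q^z P_{x,y,z}(t)$ in (6) we arrive at

\[
\begin{aligned}
\frac{\partial f(r,s,q,t)}{\partial t} = {} & k_x (r - 1) f(r,s,q,t) + \left[ d_x (1 - r) + k_y (s - r) \right] \frac{\partial f(r,s,q,t)}{\partial r} \\
& + \left[ d_y (1 - s) + k_z (q - s) \right] \frac{\partial f(r,s,q,t)}{\partial s} + d_z (1 - q) \frac{\partial f(r,s,q,t)}{\partial q}.
\end{aligned}
\tag{7}
\]

At steady-state the product of three generating functions of Poisson distributions
$f(r,s,q) = e^{\langle x\rangle (r-1)} e^{\langle y\rangle (s-1)} e^{\langle z\rangle (q-1)}$ solves (7) when

\[
\langle x\rangle = \frac{k_x}{d_x + k_y}, \qquad
\langle y\rangle = \frac{k_y \langle x\rangle}{d_y + k_z}, \qquad
\langle z\rangle = \frac{k_z \langle y\rangle}{d_z}.
\tag{8}
\]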

Consequently, the steady-state distribution of $z$ in the cascade (5) is identical to the
steady-state distribution of $z$ in the single birth-death process
$\emptyset \xrightarrow{\tilde{k}_z} z \xrightarrow{d_z} \emptyset$, where
$\tilde{k}_z = k_z \langle y\rangle$. As expected for Poisson-distributed quantities, we find
$\langle r^2\rangle - \langle r\rangle^2 = \langle r\rangle$ for $r = \{x, y, z\}$, and computing
the Fano factor $F_r$, defined as the variance over the mean of the random variable $r$,
we obtain $F_x = F_y = F_z = 1$.

So far we have proved that at steady-state the mature product of (5) follows Poisson
statistics. However, we have not shown that the statistics of the production events of $z$ at
steady-state in (5) are identical to the production statistics of $z$ for a constant effective
rate $\tilde{k}_z$. This question will be addressed in section 2.8 after we investigate the case of
RNA biogenesis with fluctuating processing rates.
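
As a quick numerical sanity check of (7) and (8), the following minimal Python sketch evaluates the right-hand side of (7) on the product-Poisson ansatz and confirms that it vanishes at an arbitrary point $(r, s, q)$. The function names and the rate values are hypothetical and chosen only for illustration; they are not taken from the simulations reported here.

```python
import math

def steady_state_means(kx, ky, kz, dx, dy, dz):
    """Mean copy-numbers of x, y, z according to equation (8)."""
    mean_x = kx / (dx + ky)
    mean_y = ky * mean_x / (dy + kz)
    mean_z = kz * mean_y / dz
    return mean_x, mean_y, mean_z

def rhs_of_eq7(r, s, q, rates):
    """Right-hand side of (7) evaluated on the product-Poisson ansatz."""
    kx, ky, kz, dx, dy, dz = rates
    mx, my, mz = steady_state_means(*rates)
    f = math.exp(mx * (r - 1) + my * (s - 1) + mz * (q - 1))
    # For the ansatz, df/dr = mx*f, df/ds = my*f, df/dq = mz*f.
    return (kx * (r - 1) * f
            + (dx * (1 - r) + ky * (s - r)) * mx * f
            + (dy * (1 - s) + kz * (q - s)) * my * f
            + dz * (1 - q) * mz * f)

# Hypothetical rates (per hour): kx, ky, kz, dx, dy, dz.
example_rates = (10.0, 2.0, 1.0, 1.0, 1.0, 0.5)
residual = rhs_of_eq7(0.3, 0.7, 0.9, example_rates)
print(steady_state_means(*example_rates), residual)
```

The residual is zero up to floating-point rounding for any choice of positive rates, which is the statement that the product of Poisson generating functions is a stationary solution of (7).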

2.6. Synergistic or antagonistic effects of polymorphisms in sRNA biogenesis 

Variation across organisms in sRNA processing dynamics can be two-fold: (i) the same 
precursor sRNA can be processed differently by the RNA-processing enzymes producing 
several isoforms of the mature sRNA [1], or (ii) different precursor sRNAs from different 
loci or different alleles across species can have different processing rates but produce 
identical mature sRNAs [24, 25, 26, 27]. In the first case, the processing variation 
affects the efficiency of recruitment of the mature sRNA isoforms by the Argonaute 
proteins [28], and consequently the recycling rate of recruited mature sRNAs after they 
have catalyzed a transcript-targeting event [6, 29]. Both of these effects have been 
addressed elsewhere [15]. Here, we investigate the case of differences in the processing 
rates between two precursor sRNAs that give rise to identical mature sRNA products. 
We assume two miRNA alleles (miR$_{1,2}$) produce identical mature miRNA products,
but differ in the transcription and processing rates of their respective pri-miRNAs,
with all other rates being equal. In particular, we assume that miR$_1$ is transcribed at a
faster rate than miR$_2$ ($k_{x1} > k_{x2}$) but the pri-miRNA of miR$_1$ is processed at a
slower rate ($k_{y1} < k_{y2}$). Is it possible for the steady-state copy-number $z_1$ of the
mature miRNA product of miR$_1$ to be less than $z_2$, the copy-number of the identical
mature miRNA product of miR$_2$? The ratio of the steady-state expression levels of (8) yields

\[
\frac{\langle z_1\rangle}{\langle z_2\rangle} = \frac{k_{x1}}{k_{x2}} \, \frac{1 + d_x/k_{y2}}{1 + d_x/k_{y1}}.
\tag{9}
\]

As long as pri-miR$_1$ is processed at a lower rate than pri-miR$_2$ ($k_{y1} < k_{y2}$), and despite
the fact that miR$_1$ is transcribed at a higher rate than miR$_2$ ($k_{x1} > k_{x2}$), the steady-state
expression level of mat-miR$_2$ can still be higher ($z_1 < z_2$). Similar results are obtained
if variation is present in the processing rates of the precursor miRNAs instead of the
primary miRNAs. In other words, the effects of polymorphisms in the transcription
of sRNAs can be reversed or enhanced with the appearance of polymorphisms in their
processing steps.
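
The reversal can be illustrated with a short Python sketch that evaluates the ratio of mature products both directly from the steady-state means of (8) and from the closed form obtained by dividing them, assuming the two alleles differ only in transcription rate and pri-miRNA processing rate. All numerical values and function names below are hypothetical.

```python
def mean_z(kx, ky, kz, dx, dy, dz):
    """Steady-state mean of the mature product z from equation (8)."""
    mean_x = kx / (dx + ky)
    mean_y = ky * mean_x / (dy + kz)
    return kz * mean_y / dz

def mature_ratio(kx1, kx2, ky1, ky2, dx):
    """Ratio <z1>/<z2> when only kx and ky differ between the two alleles."""
    return (kx1 / kx2) * (1 + dx / ky2) / (1 + dx / ky1)

# Hypothetical alleles: allele 1 is transcribed twice as fast as allele 2,
# but its primary transcript is processed a hundred times more slowly.
shared = dict(kz=1.0, dx=1.0, dy=1.0, dz=0.5)
ratio_direct = mean_z(kx=2.0, ky=0.1, **shared) / mean_z(kx=1.0, ky=10.0, **shared)
ratio_closed = mature_ratio(kx1=2.0, kx2=1.0, ky1=0.1, ky2=10.0, dx=1.0)
print(ratio_direct, ratio_closed)
```

With these illustrative numbers both expressions give a ratio below one, so the more strongly transcribed allele yields the lower mature copy-number.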

2.7. RNA biogenesis with fluctuating processing rates

So far we considered mRNA and sRNA biogeneses with constant reaction rates and 
showed that at steady-state the mature RNA follows Poisson statistics. Now we consider 
changes in the processing rates due to fluctuations in enzyme copy-numbers. We work 
with the generic RNA cascade (5). 

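We define at a given instant the number of molecules of the enzymes that process
RNA transcripts at the first and second step in (5) as $\alpha$ and $\beta$, respectively. Taking
into account variations in enzyme copy-number, RNA biogenesis becomes

\[
\begin{aligned}
\emptyset \xrightarrow{k_x} x &\xrightarrow{d_x} \emptyset, \\
x \xrightarrow{k_y \alpha} y &\xrightarrow{d_y} \emptyset, \\
y \xrightarrow{k_z \beta} z &\xrightarrow{d_z} \emptyset,
\end{aligned}
\tag{10}
\]

where the replacements $k_y \to k_y\alpha$ and $k_z \to k_z\beta$ are made in (5). That is, $k_y$ is now the
constant conversion rate per $x$ molecule and per $\alpha$ enzyme, whereas the processing rate
$k_y\alpha(t)$ per $x$ molecule fluctuates over time. The same applies for the processing rate
$k_z\beta(t)$ per $y$ molecule.

We denote as $1/\kappa$ the characteristic time scale over which enzyme expression level
variation occurs. If this characteristic time scale is much longer than the time scales
associated with the biogenesis of RNA, then we can use the results of (8) for given
values of $\alpha, \beta$ and ensemble-average over the equilibrium distributions of $\alpha, \beta$. The
Fano factors $F_r$ for $r = \{x, y, z\}$ are given by

\[
F_r = 1 + \frac{k_x}{d_x} \, \mathbb{F}[S_r], \qquad
\mathbb{F}[S_r] = \frac{\mathbb{E}_{\alpha,\beta}[S_r^2] - \mathbb{E}_{\alpha,\beta}[S_r]^2}{\mathbb{E}_{\alpha,\beta}[S_r]},
\tag{11}
\]

where $\mathbb{E}_{\alpha,\beta}[\cdot]$ indicates averaging over the equilibrium distributions of $\alpha, \beta$, and
$\mathbb{F}[S_r]$ are Fano factors of the stochastic variables $S_r$ listed below

\[
S_x = \frac{1}{1 + k_y\alpha/d_x}, \qquad
S_y = \frac{k_y\alpha/d_y}{1 + k_y\alpha/d_x} \, \frac{1}{1 + k_z\beta/d_y}, \qquad
S_z = \frac{k_y\alpha/d_y}{1 + k_y\alpha/d_x} \, \frac{k_z\beta/d_z}{1 + k_z\beta/d_y}.
\tag{12}
\]
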
No assumption has been made about the particular form of the distributions of $\alpha, \beta$. The
only assumptions are: (i) the distribution of enzyme expression levels is in steady-state,
and (ii) the enzyme biogenesis dynamics is slower than the RNA biogenesis dynamics,
allowing us to employ the adiabatic approximation. Formally, the latter condition is
expressed as $\kappa \ll \max\{k_y\alpha, k_z\beta, d_x, d_y, d_z\}$.

How do fluctuations in $\alpha, \beta$ introduce a linear dependence on $k_x$ in (11)? The answer
lies in the way enzyme copy-number fluctuations imprint on substrates. Any change in
$\alpha$ or $\beta$ during time intervals that are similar to or longer than the typical time interval of
substrate biogenesis is felt by the corresponding substrate. The strength of this change
is always related to the abundance of the corresponding substrate, which in turn is
proportional to $k_x$. Therefore, the more abundant the substrates become, the larger
the effect of enzyme copy-number fluctuations on them becomes as well. Furthermore,
the strength of this effect depends also on the timescale of enzyme biogenesis. For 
example, faster enzyme copy-number fluctuations have less of an effect over the longer 
time scales of substrate biogenesis. In this case, processing of the substrate seems to be 
taking place under almost constant rates and the Fano factors of (11) tend to unity. On 
the other hand, slower fluctuations in the enzyme biogeneses induce slower changes in 
the processing rates, which are perceived by the substrates and render their biogeneses 
more noisy. 
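
The linear dependence on the transcription rate can be made concrete with a small Python sketch of the adiabatic estimate. It assumes that, for fixed $\alpha, \beta$, each substrate is Poisson-distributed with mean $(k_x/d_x) S_r$, so that the law of total variance gives $F_r = 1 + (k_x/d_x)\,\mathrm{Var}[S_r]/\mathrm{E}[S_r]$. It further assumes, purely for illustration, that both enzymes are independently Poisson-distributed with means `A` and `B`; the function names and parameter values are hypothetical.

```python
import math

def poisson_weights(mean, nmax):
    """Poisson probabilities for n = 0..nmax-1 (tail beyond nmax is neglected)."""
    return [math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))
            for n in range(nmax)]

def adiabatic_fano(kx, ky, kz, dx, dy, dz, A, B, nmax=150):
    """Adiabatic Fano factors of x, y, z for Poisson-distributed enzymes."""
    pa, pb = poisson_weights(A, nmax), poisson_weights(B, nmax)
    m1 = {"x": 0.0, "y": 0.0, "z": 0.0}   # E[S_r]
    m2 = {"x": 0.0, "y": 0.0, "z": 0.0}   # E[S_r^2]
    for a, wa in enumerate(pa):
        u = ky * a / dx
        for b, wb in enumerate(pb):
            v = kz * b / dy
            s_x = 1.0 / (1.0 + u)
            s_y = (ky * a / dy) / ((1.0 + u) * (1.0 + v))
            s_z = s_y * kz * b / dz
            for r, s in (("x", s_x), ("y", s_y), ("z", s_z)):
                m1[r] += wa * wb * s
                m2[r] += wa * wb * s * s
    return {r: 1.0 + (kx / dx) * (m2[r] - m1[r] ** 2) / m1[r] for r in m1}

# Hypothetical parameters (rates per hour, enzyme means A = B = 20).
base = dict(ky=0.05, kz=0.05, dx=1.0, dy=1.0, dz=1.0, A=20.0, B=20.0)
fano_low = adiabatic_fano(kx=10.0, **base)
fano_high = adiabatic_fano(kx=20.0, **base)
print(fano_low, fano_high)
```

Doubling `kx` doubles the excess $F_r - 1$ for every component, while the factor multiplying `kx` depends only on the enzyme statistics and the remaining rates.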

The theoretical results of (11-12) for the strength of fluctuations in the RNA
cascade are independent of the details of the enzyme biogenesis. However, in numerical
simulations we limit for simplicity the enzyme biogenesis to the following birth-death
processes

\[
\emptyset \xrightarrow{\kappa A} \alpha \xrightarrow{\kappa} \emptyset, \qquad
\emptyset \xrightarrow{\kappa B} \beta \xrightarrow{\kappa} \emptyset,
\tag{13}
\]

where $A, B$ are the enzymatic expression levels at steady-state and $1/\kappa$ governs the
time scale during which enzyme expression levels remain constant. In Figure 1 we
numerically test the predictions of (11) based on the enzyme biogeneses of (13). We
show the linear dependence on $k_x$ of the Fano factors computed from 10^ simulations
of (10-13) using the Gillespie algorithm [30] with $x$, $y$ and $z$ collected after the system
reached a steady-state. The kinetic parameters used in the simulations correspond to the
siRNA biogenesis of Salmonella [31]. While prokaryotes and eukaryotes have different
sRNA biogeneses, differences are quantitative rather than qualitative. If the
microprocessor/Dicer activity is replaced with the Cascade/Cas activity [5], then (4) and (10)
remain applicable. We numerically simulated (10-13) also in the range of parameters associated
with mammalian mRNA expression [32], and miRNA activity in mammals [33, 34, 35] 
and obtained similar results in each case (data not shown). 
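
A minimal, self-contained Gillespie sketch of the cascade with fluctuating enzymes is given below. It is not the simulation code used for Figure 1: it assumes enzyme birth-death processes with birth rates `kappa*A`, `kappa*B` and per-molecule death rate `kappa`, estimates Fano factors from time-weighted averages over a single long trajectory rather than from an ensemble of runs, and uses hypothetical parameter values.

```python
import random

def simulate_fano(kx, ky, kz, dx, dy, dz, A, B, kappa, t_end,
                  fixed_enzymes=False, seed=1):
    """Gillespie simulation of the two-step cascade; returns [F_x, F_y, F_z]."""
    rng = random.Random(seed)
    x = y = z = 0
    a, b = A, B                       # enzymes start at their mean levels
    t, t_burn = 0.0, 0.1 * t_end      # discard the initial transient
    s1 = [0.0, 0.0, 0.0]              # time-weighted sums of n
    s2 = [0.0, 0.0, 0.0]              # time-weighted sums of n^2
    n_reactions = 6 if fixed_enzymes else 10
    while t < t_end:
        rates = [kx, dx * x, ky * a * x, dy * y, kz * b * y, dz * z,
                 kappa * A, kappa * a, kappa * B, kappa * b][:n_reactions]
        dt = rng.expovariate(sum(rates))
        lo, hi = max(t, t_burn), min(t + dt, t_end)
        if hi > lo:                   # accumulate the state held during [lo, hi)
            for i, n in enumerate((x, y, z)):
                s1[i] += (hi - lo) * n
                s2[i] += (hi - lo) * n * n
        t += dt
        j = rng.choices(range(n_reactions), weights=rates)[0]
        if j == 0:
            x += 1                    # transcription
        elif j == 1:
            x -= 1                    # degradation of x
        elif j == 2:
            x -= 1; y += 1            # processing x -> y
        elif j == 3:
            y -= 1                    # degradation of y
        elif j == 4:
            y -= 1; z += 1            # processing y -> z
        elif j == 5:
            z -= 1                    # degradation of z
        elif j == 6:
            a += 1                    # enzyme alpha produced
        elif j == 7:
            a -= 1                    # enzyme alpha degraded
        elif j == 8:
            b += 1                    # enzyme beta produced
        else:
            b -= 1                    # enzyme beta degraded
    T = t_end - t_burn
    means = [s / T for s in s1]
    return [s2[i] / T / means[i] - means[i] for i in range(3)]

if __name__ == "__main__":
    # Hypothetical parameters (rates per hour).
    p = dict(kx=20.0, ky=0.2, kz=0.2, dx=1.0, dy=1.0, dz=1.0,
             A=5, B=5, kappa=0.1, t_end=2000.0)
    print("fixed enzymes:      ", simulate_fano(fixed_enzymes=True, **p))
    print("fluctuating enzymes:", simulate_fano(fixed_enzymes=False, **p))
```

With fixed enzymes the estimated Fano factors should stay close to one, whereas slow enzyme fluctuations should raise them; repeating the run for several values of `kx` gives a rough picture of the linear trend.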

In the main plot of Figure 1, Fano factors order according to $F_z < F_x < F_y$ for
different values of the transcription rate and the speed of fluctuations in enzyme
copy-numbers ($\kappa > 0.1/\mathrm{h}$). It seems that the two-step RNA processing cascade (10) amplifies
fluctuations in $y$ but filters fluctuations in $z$ under certain conditions and to a certain
extent. Fluctuations in the intermediate product $y$ are expected to be stronger than
fluctuations in the transcript $x$, because both the birth and death rates of $y$ vary with
time, whereas only the death rate of $x$ varies with time. However, the birth rate of the
mature product $z$ follows closely the fluctuations that $y$ undergoes, so one would expect
the possibility $F_z > F_y$ to arise as well. These observations are reflected in (12), where
$S_x$ depends only on $\alpha$, whereas both $S_y$ and $S_z$ depend on $\alpha$ and $\beta$. In fact, $S_y$ and $S_z$
differ only by the factor $S_z/S_y = k_z\beta/d_z$. This factor, along with the speed of enzyme
fluctuations $\kappa$, determines the differences observed between $F_y$ and $F_z$ in Figure 1. For
given $\kappa$, if $k_z\beta/d_z \gg 1$ then fluctuations in the ancestor substrates are amplified in the
mature RNA product, whereas noise filtering takes place if $k_z\beta/d_z \ll 1$. During a given
time interval, when $k_z\beta \gg d_z$, there are on average many more $z$-production events due
to $y$-processing than $z$-degradation events and noise amplification occurs, whereas when
$k_z\beta \ll d_z$, the $z$-degradation events overwhelm the $z$-production events, resulting in
noise filtering. This effect shows up in the inset of Figure 1, which shows results for $\kappa = 0.1/\mathrm{h}$
and a mature RNA product twice as stable ($d_z = 0.5/\mathrm{h}$) as in the main plot.
In this case, we observe $F_x \approx F_y < F_z$: the RNA cascade amplifies the strength of
fluctuations in the mature RNA product (blue lines). However, when the processing
rate is reduced so that $k_z\beta < d_z$, the situation reverses again (red lines) and $F_x \approx F_y > F_z$:
the RNA cascade buffers the noise in the expression level of the mature RNA product,
as is also observed in the main plot of the figure.

The majority of mature RNA products are expected to be more stable than their 
precursors. For example, mature miRNAs are stabilized when loaded onto Argonaute
proteins, and in certain cases in vitro half-lives become longer than a day [35, 36].
Mature transcripts are also protected by 5'-capping and polyadenylation. Thus for most
cases we expect $d_z < \{d_x, d_y\}$, leading to the amplification of fluctuations through the
processing cascade. On the other hand, in cases where the processing rate per precursor
molecule is lower than $d_z$, or the mature RNA product is unusually unstable, the RNA
cascade will operate in the reverse regime and reduce the strength of fluctuations in the 
expression level of the mature RNA product. 

2.8. Dynamics of production events in the RNA cascade 

We have shown that in the absence of enzymatic copy-number fluctuations any product
of the cascade (5) is Poisson-distributed at steady-state. We now show that the dynamics
of $z$ production in (5), and thus correlations to any order, are identical to the dynamics
and correlations of a homogeneous Poisson process of rate $\tilde{k}_z = k_z\langle y\rangle$. For simplicity, let us
investigate a variation of (5) consisting only of a single processing step

\[
\emptyset \xrightarrow{k_x} x \xrightarrow{k_y} y \xrightarrow{d_y} \emptyset, \qquad
x \xrightarrow{d_x} \emptyset.
\tag{14}
\]

The addition of more processing steps is straightforward to handle. Since enzyme
copy-numbers in (14) are fixed, $x$ and $y$ are Poisson-distributed at steady-state. Additionally,
the dynamics of $x$ is a birth-death process $\emptyset \xrightarrow{k_x} x \xrightarrow{d_x + k_y} \emptyset$
with constant birth and death rates. In other words, the creation of $x$ molecules is a
homogeneous Poisson process. Once an $x$ molecule is created, it either decays, or after a
time interval it is converted into a $y$ molecule. For constant processing rates, this time
interval is exponentially distributed and the time instants of $y$-creation events become
uniformly distributed. Considering all such $y$-creation events originating from the rest of
the $x$ molecules introduces random shifts into the time intervals of $y$-creation events.
However, the distribution of the time instants of $y$-creation events remains uniform, and
the underlying $y$-creation process remains a homogeneous Poisson process with a rate of
$k_y\langle x\rangle$. Therefore, the statistics of the process $x \xrightarrow{k_y} y$ becomes identical
to the statistics generated by the process $\emptyset \xrightarrow{k_y\langle x\rangle} y$. The same
logic applies if more steps are added in the processing cascade (14).
As we discuss below, this implies that the autocorrelation function of any component
of the cascade is identical to the autocorrelation function of a birth-death process at
constant effective rates.

The situation is different when the copy-number of the processing enzyme $\alpha(t)$
fluctuates over time, leading to changes in the $x$-processing rate $k_y\alpha(t)$. All $x$ molecules
present at a given time are processed at either a higher or a lower rate depending
on the value of $\alpha(t)$. According to our discussion so far, within a time interval of the
order of $1/\kappa$ in which $\alpha(t)$ is constant, all $y$-creation events follow a homogeneous Poisson
process of rate $k_y\alpha(t)\langle x\rangle$. As $\alpha(t)$ changes, however, this rate changes also, resulting in
inhomogeneities of size of the order of $1/\kappa$ in the overall distribution of $y$-creation times.
Consequently, a homogeneous Poisson process with $y$-creation events uniformly distributed
in time produces different statistics than the process
$\emptyset \xrightarrow{k_y\alpha(t)x(t)} y$.

2.9. Autocorrelation function of the mature RNA product 

The autocorrelation function $C_z(\tau) = \langle z(t+\tau)z(t)\rangle - \langle z(t+\tau)\rangle\langle z(t)\rangle$ is a measure that
identifies temporal correlations in the copy-number of the mature RNA product $z$ in the
cascades (5) and (10). When $\tau \to 0$, one expects $z(t+\tau)$ to be highly correlated with $z(t)$.
When $\tau \to \infty$ one expects them to decorrelate, that is, one expects all "memory" of
the value of $z(t)$ to be lost when we resample $z$ at a much later time-point.

In Figure 2 we plot $C_z(\tau)/C_z(0)$ for three different systems: (i) the simple multistep
cascade (5) with constant reaction rates, (ii) a single birth-death process of constant
rates $\emptyset \xrightarrow{\tilde{k}_z} z \xrightarrow{d_z} \emptyset$ with $\tilde{k}_z = k_z\langle y\rangle$ and $\langle y\rangle$ taken from (i) at steady-state, and
(iii) the full multistep cascade (10) with a speed of enzyme copy-number fluctuations
determined by $\kappa = 0.1/\mathrm{h}$. For any range of $\tau$, there is no significant difference in the
autocorrelation function between (5) and a single birth-death process of constant rates.
In line with our previous discussion, differences appear when enzymatic fluctuations are
included. In this case, enzyme fluctuations taking place over $1/\kappa$-sized time intervals
affect the corresponding substrates. The memory of this effect across all $z$ is embedded
in the autocorrelation function: temporal correlations persist over intervals of the order
of $1/\kappa$.
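
For readers who wish to estimate the normalized autocorrelation from a simulated or measured trajectory, a minimal Python sketch is given below. It assumes the copy-number has been sampled at uniform time intervals, so that lag index $\ell$ corresponds to $\tau = \ell\,\Delta t$; the function name and the toy input are hypothetical. As a reference point, for a single birth-death process with constant rates the normalized autocorrelation decays as $e^{-d_z\tau}$.

```python
def normalized_autocorrelation(series, max_lag):
    """Estimate C(lag)/C(0) for a uniformly sampled, stationary time series."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev) / n          # variance, i.e. C(0)
    acf = []
    for lag in range(max_lag + 1):
        pairs = n - lag                       # number of (t, t + lag) pairs
        c = sum(dev[i] * dev[i + lag] for i in range(pairs)) / pairs
        acf.append(c / c0)
    return acf

# Toy input: a strictly alternating signal is perfectly anti-correlated at odd lags.
toy = [0, 1] * 50
print(normalized_autocorrelation(toy, 3))
```

Applied to a trajectory of $z$, slowly decaying values at lags of the order of $1/\kappa$ would indicate the memory induced by enzyme fluctuations.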
2.10. Effects due to RNA-processing remain prominent in the presence of transcription
bursts

Gene transcription is a process with a significant level of intrinsic noise [37] already 
without fluctuations from RNA processing. A strong promoter ensures that a gene 
is transcribed most of the time, and pauses occur over short intervals only. A weak
promoter, on the other hand, results in longer pauses of transcription, leading to strong
fluctuations in the copy number of primary transcripts. RNA-processing takes place
downstream of transcription. As a result, fluctuations in RNA-processing add to the
fluctuations due to bursty transcription.

Figure 3 shows the Fano factors of the RNA substrates at fixed transcription rate 
as a function of the rate of transcription activation in the absence (dashed blue lines) 
and in the presence (solid brown lines) of enzymatic fluctuations. As expected, Fano 
factors in the absence of enzymatic fluctuations collapse to unity in the limit of strong 
transcription activation [9], whereas in the presence of slow enzymatic fluctuations we 
recover the result of Figure 1. Additionally, fluctuations in the number of processed RNA 
increase as the rate of transcription activation is reduced and transcription becomes more 
irregular. However, the signature of RNA-processing remains prominent throughout the 
range of transcription activation values. This is despite the conservative assumption in
our numerical simulations that enzyme copy-number fluctuations are Poissonian. If
RNA-processing enzymes are produced in bursts as well [10], the effect on RNA-processing
would be stronger than what Figures 1 and 3 show.

2.11. Fluctuations in RNA-processing impact on protein biogenesis 

Protein production takes place in bursts due to multiple translation events of mRNA 
transcripts even when transcription is constitutive and the mRNA biogenesis is a birth- 
death process of constant rates [9]. Introducing RNA enzyme fluctuations in the mRNA 
biogenesis is expected to render protein production even more noisy. Here, we investigate 
within our conservative framework how much fluctuations in protein production increase 
due to fluctuations in the copy-numbers of the RNA-processing enzymes. 

Figure 4 shows the Fano factors of pre-mRNA ($x$), mRNA ($y$) and protein ($z$) as
functions of the gene transcription rate in the absence (dashed blue lines) and presence
(solid brown lines) of spliceosome copy-number fluctuations. Fano factors of the RNA
substrates collapse to unity as expected in the absence of enzymatic fluctuations, and
the protein Fano factor becomes $1 + (k_z/d_y)/(1 + d_z/d_y) = 3.5$ [9]. In the presence
of RNA-processing enzyme copy-number fluctuations, however, we find that protein
biogenesis is significantly affected and protein noise becomes linearly dependent
on the transcription rate. We stress again that these results are conservative because
we model Poissonian RNA-processing enzyme fluctuations. Additionally, copy-number
fluctuations of the ribosomal units are not included in the simulation. Relaxing either of
these two conditions would only amplify the strength of the effect shown in Figure 4.
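
The baseline protein Fano factor quoted above can be evaluated with a one-line Python helper. The parameter values in the example are hypothetical: the text does not list the rates used, so they are chosen here only because they reproduce the value 3.5.

```python
def protein_fano(kz, dy, dz):
    """Protein Fano factor 1 + (kz/dy) / (1 + dz/dy) for constant-rate mRNA biogenesis.

    kz: translation rate per mRNA, dy: mRNA degradation rate,
    dz: protein degradation rate.
    """
    return 1.0 + (kz / dy) / (1.0 + dz / dy)

# Hypothetical example: five proteins per mRNA lifetime, equal decay rates.
print(protein_fano(kz=5.0, dy=1.0, dz=1.0))
```

The formula shows that translational bursting alone sets a baseline independent of the transcription rate, on top of which enzyme fluctuations add a contribution that grows with the transcription rate.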
3. Conclusion

RNA biogenesis is a multi-step cascading process for protein-encoding transcripts and 
sRNAs alike. The output of one process becomes the input of the next one until the final 
mature product is reached. We showed that when reactions in the RNA cascades occur 
at constant rates, then the mature products undergo single-step birth-death biogeneses 
of constant effective rates and are Poisson-distributed at steady-state. This simple 
picture breaks down when there are fluctuations in the copy-numbers of enzymes that 
mediate RNA-processing. Enzymatic fluctuations induce fluctuations in the processing 
rates per corresponding substrate molecule. We showed that Fano factors of the RNA 
cascade's products increase linearly with the transcription rate, irrespective of the form 
of the steady-state distribution of the copy-numbers of enzymes participating in the 
processing steps. In numerical simulations that include transcriptional and translational 
bursting we find that this effect remains detectable in the presence of other sources of 
fluctuations, especially for the case of protein biogenesis. 

Post-transcriptional regulation is a significant part of sRNA biogenesis. Mature 
sRNAs are recruited by Argonaute proteins in order to form the RNA-induced silencing 
complexes (RISCs) [i]. RISCs are the units that mediate PTR, but also ensure 
the stability of sRNAs, rendering them important elements of the sRNA biogenesis. 
Additionally, early sRNA processing steps can be affected in numerous ways via 
feedback regulation. One example is miR162 regulating DCL1; further examples include 
miR168 regulating the Argonaute protein AGO1, and miR403 regulating AGO2 [8]. 
Furthermore, there is abundant evidence of feedback regulation to sRNA genes by their 
transcription factor targets [3n]. All of these dynamics remained outside the scope of our 
analysis, as they involve PTR rather than the sRNA biogenesis. However, if feedback 
regulates the transcription of sRNA only, then the conclusions of our analysis are still 
applicable. If enzyme fluctuations are negligible, one can replace the sRNA cascade with 
a single birth-death process and incorporate the feedback regulation into the sRNA birth 
process. If, on the other hand, as in the case of miR162 and DCL1, feedback regulation 
takes place in sRNA processing, then clearly the processing details matter and need to be 
included. This can be the topic of a future study of feedback dynamics within the 
RNA-processing cascade. Finally, many RNA subclasses are not mentioned in this work. 
For example, piwi-interacting RNAs, small nucleolar RNAs, and small nuclear RNAs all 
have distinct biogeneses and functionalities [39]. If these RNA subclasses follow 
processing chains like (5), or the more general (10), then our results apply to them as well. 

Single nucleotide polymorphisms affect the processing rate of precursors of 
sRNAs and impact on the expression level of the mature products [27]. Naturally, 
polymorphisms affecting sRNA transcription also induce variation across species in 
the production of mature sRNAs. However, based on our analysis of the steady-state 
expression levels in the sRNA biogenesis, we predict that the effect of transcriptional 
variation can be enhanced, reduced, or even reversed by the presence of variation in 
sRNA processing. 
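
A short numerical sketch makes this prediction concrete. It assumes the same linear constant-rate chain as the Figure 1 parameters suggest (transcription at rate k_x, processing at rates k_yA and k_zB, degradation at rates d_x, d_y, d_z). The function name and the variant parameter values are hypothetical and chosen only for illustration.

```python
def mature_mean(k_x, k_yA, k_zB, d_x=1.0, d_y=1.0, d_z=1.0):
    """Steady-state mean of the mature product z in an assumed linear chain.

    Each precursor is either processed or degraded, so the mean of every
    species is its inflow divided by its total removal rate.
    """
    x = k_x / (d_x + k_yA)
    y = k_yA * x / (d_y + k_zB)
    return k_zB * y / d_z


if __name__ == "__main__":
    reference = mature_mean(k_x=20.0, k_yA=1.0, k_zB=1.0)
    # Hypothetical variants: all carry a 50% higher transcription rate.
    variants = {
        "transcription only": mature_mean(30.0, 1.0, 1.0),
        "plus faster processing (enhanced)": mature_mean(30.0, 4.0, 1.0),
        "plus slower processing (reduced)": mature_mean(30.0, 0.6, 1.0),
        "plus much slower processing (reversed)": mature_mean(30.0, 0.25, 1.0),
    }
    print(f"reference: {reference:.3f}")
    for label, value in variants.items():
        print(f"{label}: {value:.3f}")
```

In this toy setting the same increase in transcription raises the mature level from 5 to 7.5 on its own, to 12 when processing is faster, to only 5.625 when processing is slower, and lowers it to 3 when processing is much slower.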

Our analysis showed that the RNA cascade amplifies the noise in the expression 
level of a mature RNA product provided the mature product is more stable than its 
precursors (as is generally the case). However, we showed it is also possible for the RNA 
cascade to reduce the noise in the expression level of unusually unstable mature RNA 
products, or of those mature products whose precursors are processed at low rates. 

In summary, we investigated the RNA processing cascade and found universal 
characteristics in the steady-state dynamics for different RNA species. If processing 
steps take place at constant rates, then the mature RNA biogenesis can be modeled at 
steady-state as a single constant rate birth-death process. Variation in the processing 
rates induces additional fluctuations to the RNA cascade and the single birth-death 
picture breaks down. Finally, we showed that polymorphisms in the processing rates 
can act synergistically or antagonistically to polymorphisms in the transcription rates 
of RNA. Our work offers a framework to better understand the dynamics of RNA 
biogenesis. 

Figure 1. Strength of fluctuations in RNA biogenesis depends linearly 
on the RNA transcription rate. The Fano factors of x, y, and z in (10) 
are plotted as functions of the transcription rate $k_x$. The parameter values 
used are typical for the siRNA biogenesis in Salmonella [M]: $k_y A = k_z B = 
d_x = d_y = d_z = 1$/h, $A = B = 500$. In the main plot the rate $\kappa$ of the 
biogeneses of the RNA-processing enzymes is varied. Blue lines correspond to 
$\kappa = 0.1$/h, brown lines correspond to $\kappa = 1$/h, and red lines to $\kappa = 10$/h. 
The horizontal black line indicates the range of unity Fano factors predicted 
by (6) for Poissonian statistics. Variation in enzyme copy-numbers induces 
fluctuations in the substrates, whose amplitude depends linearly on the RNA 
transcription rate. For $\kappa > 0.1$/h and for $k_x \gtrsim 1$/min we find $F_z < F_x < F_y$: 
the RNA cascade buffers the fluctuations in z. However, in the inset we plot 
substrate Fano factors for twice as stable mature RNA ($d_z = 0.5$/h) and for 
$\kappa = 0.1$/h and find $F_x \approx F_y < F_z$ (blue lines): the RNA cascade amplifies 
the strength of fluctuations in z. If, on the other hand, the average processing 
rate of this stable mature RNA is also reduced ($k_z B = 0.1$/h), the relation 
$F_x \approx F_y > F_z$ is restored (red lines) and the RNA cascade again buffers the 
fluctuations in z. 

Figure 2. Autocorrelation function of the mature product in the RNA 
cascade. The normalized autocorrelation function of the mature product z is 
evaluated at steady-state for different time-points and ensemble-averaged over 
10^ realizations of the system. Identical parameters as in Figure 1 are used. 
The transcription rate of x is set to $k_x = 20$/min. We plot $C_z(t)/C_z(0)$ for 
(i) the simple cascade (5) (red solid line), (ii) a single birth-death process with 
identical z-degradation rate as in (i) and z-creation rate equal to the average 
creation rate in (i) (dark-blue dashed line), and (iii) the full cascade (10) with 
fluctuating enzyme numbers ($\kappa = 0.1$/h) and otherwise identical parameters as 
in (i) (light-blue dash-dotted line). At constant rates, there is no distinction 
in the dynamics between (i) and (ii). In the presence of enzyme fluctuations, 
correlations persist over longer times, of the order of $1/\kappa$. 

Figure 3. RNA processing with transcription bursts. Fano factors of 
the RNA substrates x, y, z in the chain (10) are plotted as functions of the rate 
of transcription activation $k_{\mathrm{on}}$. The rate of transcription inactivation is fixed 
at $k_{\mathrm{off}} = 0.1$/min [37]. When transcription is active the transcription rate is 
$k_x = 60$/min. The remainder of the parameters are identical to Figure 1. In 
the absence of enzyme copy-number fluctuations ($\kappa = \infty$, dashed blue lines) 
and when $k_{\mathrm{on}} \sim k_{\mathrm{off}}$, transcription bursting leads to non-Poissonian statistics. 
This deviation from Poisson statistics is further enhanced by fluctuations in 
RNA-processing downstream of transcription ($\kappa = 0.1$/min, solid brown lines). 

Figure 4. Effect of RNA processing on protein biogenesis. Fano factors 
of pre-mRNA (x), mRNA (y) and protein (z) are plotted as functions of the 
gene transcription rate $k_x$ in the absence ($\kappa = \infty$, dashed blue lines) and 
in the presence ($\kappa = 0.1$/min, solid brown lines) of spliceosome copy-number 
fluctuations. The translation rate per mRNA molecule is fixed at $k_z = 5$/h and 
the rest of the parameters are identical to Figure 1. In the absence of spliceosome 
fluctuations the statistics of x and y are Poissonian, but bursts of translation 
lead to non-Poissonian statistics of z with a Fano factor $F_z$ independent of 
$k_x$ [9]. When spliceosome copy-number fluctuations are included, fluctuations 
in protein production increase linearly with $k_x$. Our results are conservative 
since fluctuations in ribosome copy-numbers are ignored, and spliceosome 
fluctuations are taken to be Poissonian. 



